Synthetic speech sound generator



Aug. 25, 1964 G. L. CLAPPER 3,146,309

SYNTHETIC SPEECH SOUND GENERATOR Filed Dec. 18, 1961

[Drawing sheet. Legible labels: comp., gates, QO-1 through QO-6, squelch and intensity control, hiss signal generator, sawtooth signal generator, bias control, gate, hiss, sawtooth, QO-1 output. FIG. 3, FIG. 4.]

INVENTOR GENUNG L. CLAPPER

AGENT

United States Patent 3,146,309 SYNTHETIC SPEECH SOUND GENERATOR Genung L. Clapper, Vestal, N.Y., assignor to International Business Machines Corporation, New York, N.Y., a corporation of New York Filed Dec. 18, 1961, Ser. No. 160,207 4 Claims. (Cl. 179-1)

This invention relates to sound generators and particularly to sound generators useful in synthesizing human speech.

It has been found by experiment that all speech sounds are characterized by certain resonant conditions. It is also generally known that the voiced sounds are produced by vibrations of the vocal cords in the pharyngeal air column to produce a pressure-versus-time pattern which is of a sawtooth variety. The coarse buzzing sound produced by the vocal cords is resonated at various frequencies dependent upon the then existing resonance conditions in the oral, nasal and pharyngeal cavities.

In similar manner, the fricative sounds are produced by constricting the free flow of air at some point in the mouth or throat, producing a relatively high pitched noise or hiss which also resonates certain of the speech cavities depending upon the method of production. Thus, for example, an f sound produced on the lips is always higher in pitch than the k sound that is produced toward the back of the mouth. Also, the k sound in the word "key" is higher in frequency than the k sound in the word "cool", because of the different resonance conditions set up by the succeeding vowel sound. It is reasonable, therefore, to assume that in a speech synthesis system, the fricative or sibilant sounds should be modulated by a frequency which would represent the resonant vocal condition.

A synthesis of vowel sounds can be achieved by mixing the outputs of selected sine wave oscillators. However, the resultant sound when audibly reproduced is too musical for maximum intelligibility. Also, since the oscillators are free running, random beat notes will be produced and synchronization of the oscillators is a very difficult problem. The present invention has, therefore, for its principal object the provision of a new and improved form of synthetic speech sound generator which very closely emulates the manner in which human voice sounds are produced.

Another object of the invention is to provide a new and improved synthetic speech sound generator which uses a minimum number of reliable but inexpensive reactive components to provide an economical system.

A further object of the invention is to provide an improved synthetic speech sound generator utilizing a circuit arrangement which is normally quiescent but when activated produces an output at a selected frequency, modulated in accordance with the waveform of an input signal.

Still another object of the invention is to provide an improved synthetic speech sound generator utilizing a combination of circuits which, in their gated state, stand almost on the verge of oscillation, and which will oscillate in a manner determined to a large extent by the presence of another signal so that the output comprises a signal having a predetermined base or carrier frequency but which may be modulated by either one of two conditions to represent either a hiss type noise or a sawtooth or buzz type noise which, as previously described, are found in the human voice.

Briefly described, the speech sound generator according to the present invention constitutes a modified form of the well known phase shift oscillator. In this arrangement, a suitable amplifying stage is provided with a feedback path including a combination of resistive and capacitive elements which, when supplied through a suitable impedance matching source back to the input of the amplifier stage having a suitable gain, will cause oscillations to take place therein provided that the feedback is of the proper amplitude and phase relationship. However, a gain control is provided in the circuit which brings the total loop gain to a point just less than unity so that losses in the loop are provided for but there is insufficient feedback to cause sustained oscillation. Upon the supply to the circuit of an additional gating impulse, the gain of the loop is sufficiently increased so that any input signals supplied to an input of an oscillator circuit will be fed back in sufficient magnitude to cause a sustained oscillation. Since the frequency of the oscillation is determined by the reactive network included therein and the input may consist either of a noise type source or a sawtooth type source, the output of the oscillator will comprise a modulated wave which has either the form of a damped train of oscillations when the generator is modulated by a sawtooth or will consist of a sine wave having a superimposed high frequency modulation thereon when the input to the generator consists of the hiss or noise frequencies. The basic oscillating frequency will be the same in both cases as determined by the reactive feedback network. In this manner, the generator acts the same as the resonant cavity acts when supplied with an external stimulating signal.

The foregoing and other objects, features and advantages of the invention will be apparent from the following more particular description of a preferred embodiment of the invention, as illustrated in the accompanying drawings.

In the drawings:

FIG. 1 is a block diagram of a portion of a speech synthesizer incorporating a preferred embodiment of the present invention.

FIG. 2 is a detailed circuit diagram of the portion of the system of FIG. 1, illustrating the details of the quasi oscillators or speech resonators, and the compensation circuit employed therewith.

FIGS. 3 and 4 are illustrative waveform diagrams showing the operation of the apparatus under various conditions.

Referring to the drawings, FIG. 1 shows, in schematic form, the portion of the speech synthesizing system, including a preferred embodiment of the present invention. Six quasi oscillators or sound generators designated as QO-1 through QO-6 are provided, each of which has a corresponding gate input line designated by the reference characters G1 through G6, connected to the quasi oscillators as shown. Signals supplied over lines G1 through G6 are thus selectively applied to the gate inputs of the associated quasi oscillator.

The signals supplied over lines G1 through G6 are governed by gating means such as the gates 5 in accordance with the nature of the speech sound which is to be reproduced. Each of the quasi oscillators QO-1, QO-2, QO-3 and QO-4 is also supplied with a hiss signal input from a terminal HS, this hiss signal being generated by a high frequency source such as hiss generator 7 which is also governed by means not shown for providing the hiss component of the synthesized speech. All of the quasi oscillators QO-1 through QO-6 are also connected to a terminal ST, which in turn is supplied from a sawtooth signal generator 9, which provides a sawtooth signal waveform, rising from a low level in relatively short time to a higher level and then decreasing at a given rate until the lower level is reached whereupon the cycle is repeated.
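By way of illustration only, the sawtooth shape just described (a quick rise from the low level, then a slower fall at a fixed rate back to the low level) can be sketched as follows. The function name, the sample counts and the 10 percent rise fraction are hypothetical; the specification states no pitch, rise time or signal levels.

```python
def sawtooth(num_samples, period, rise_fraction=0.1, low=0.0, high=1.0):
    """Return a list of samples: fast linear rise, then slow linear fall.

    All parameters are illustrative assumptions; the patent gives no values.
    """
    rise_len = max(1, int(period * rise_fraction))
    fall_len = period - rise_len
    if fall_len < 1:
        raise ValueError("period too short for the requested rise fraction")
    out = []
    for n in range(num_samples):
        k = n % period  # position inside the current cycle
        if k < rise_len:
            # short rising edge from the low level toward the high level
            level = low + (high - low) * k / rise_len
        else:
            # slow decrease at a constant rate back toward the low level
            level = high - (high - low) * (k - rise_len) / fall_len
        out.append(level)
    return out


# Two cycles of 100 samples each, rising during the first 10 samples.
wave = sawtooth(200, 100)
```

A longer rise fraction would soften the edge that excites the quasi oscillators, so a short one is assumed here to match the "relatively short time" in the description.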

The outputs from the quasi oscillators QO-1 through QO-6 are each connected to a common output line CM, which is connected to a squelch and intensity control circuit having a set of inputs Cl and S9, the sources of which are not shown but which serve to control the squelching and intensity of the resultant output signals supplied from the quasi oscillators. The details of the squelch and intensity controls, as well as the sources of the signals for controlling this portion of the apparatus, are not shown since they form no portion of the present invention and are set forth only in order to illustrate the background in which the invention may be employed. Since the D.C. output level on the common output line CM could vary in accordance with the number of oscillators which are operating, a D.C. compensating circuit is provided which has inputs from each of the lines G1 through G6, as shown, and has an output line which serves to maintain the output D.C. level constant on the common line CM, regardless of the number of oscillators which are active. The common output line CM is connected by way of an output gain control OGC to an output amplifier OA, and thence to a loud speaker LS to audibly reproduce the signals.

Referring now to FIG. 2 of the drawings, each of the quasi oscillators comprises a circuit such as the one shown including a pair of amplifying devices such as the transistors T1 and T2, here shown as being of the PNP variety. Transistor T1 is connected as a grounded emitter stage with voltage gain between the input at the base thereof and the output from the collector. The emitter of transistor T1 is connected to ground and the collector thereof is connected via a fixed resistor R1 and a variable gain control resistor GC to a source of negative potential indicated as -12. The transistor T2 is connected as a grounded collector stage having power gain but having less than unity voltage gain. The base of transistor T2 is connected to the collector circuit of transistor T1, the emitter being supplied with energy from a source, not shown, having a terminal +6 connected to the emitter of transistor T2 through a fixed resistor R2, the collector of transistor T2 being connected to the common output line CM. The emitter of transistor T2 is connected to the base of transistor T1 via a phase shifting network comprising resistors R3, R4 and R5, and capacitors C1, C2 and C3, connected in the usual and well-known manner provided in phase shift oscillators. The gain control adjustment GC for each of the quasi oscillators such as the one shown in FIG. 2 is adjusted to bring the total loop gain to a point just under unity so that the voltage gain of the transistor T1 just compensates for losses in the feedback loop including the phase shifting network. Under these conditions, of course, the circuit is on the point of oscillation but will not oscillate unless energy is supplied thereto from an external source.
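As an illustrative aside, the size of the loss that T1 must make up can be estimated for a textbook phase shift network. The specification gives no values for R3 through R5 or C1 through C3 and does not say which ladder form is used, so the sketch below assumes three identical phase-lead sections (a series capacitor followed by a shunt resistor), with no loading on the output. Under those assumptions the network shifts phase by 180 degrees at f = 1/(2*pi*R*C*sqrt(6)) and attenuates by a factor of 29 at that frequency. The function names and the 10 kilohm and 10 nanofarad values are hypothetical.

```python
import math


def phase_shift_frequency(r_ohms, c_farads):
    """Frequency of 180-degree shift for three equal series-C, shunt-R sections."""
    return 1.0 / (2.0 * math.pi * r_ohms * c_farads * math.sqrt(6.0))


def ladder_transfer(freq_hz, r_ohms, c_farads, sections=3):
    """Open-circuit voltage transfer (complex) of the assumed RC ladder.

    Each section is modelled by its ABCD matrix: a series impedance
    1/(j*w*C) followed by a shunt admittance 1/R.
    """
    w = 2.0 * math.pi * freq_hz
    z = 1.0 / (1j * w * c_farads)   # series capacitor impedance
    y = 1.0 / r_ohms                # shunt resistor admittance
    sa, sb, sc, sd = 1 + z * y, z, y, 1   # ABCD of one section
    a, b, c, d = 1, 0, 0, 1               # start from the identity matrix
    for _ in range(sections):
        a, b, c, d = (a * sa + b * sc, a * sb + b * sd,
                      c * sa + d * sc, c * sb + d * sd)
    return 1.0 / a   # Vout/Vin with no load on the output


# Hypothetical component values, chosen only to land in the audio range.
f0 = phase_shift_frequency(10e3, 10e-9)
beta = ladder_transfer(f0, 10e3, 10e-9)
required_gain = 1.0 / abs(beta)   # gain needed for unity loop gain
```

With these assumed values f0 comes out at roughly 650 Hz and required_gain at about 29. If the network were instead the phase-lag form (series resistors, shunt capacitors), the attenuation would be the same but the frequency would be sqrt(6)/(2*pi*R*C).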

The gate control line associated with the particular quasi oscillator such as G1 in FIG. 2 is connected to the phase shift circuit and the base of transistor T1 through a resistor R6. The gate lines are normally at a down level such as [value illegible] volts, for example, and when activated are raised to a less negative level such as zero voltage. When at the negative level, the negative voltage reduces the gain of the loop by changing the D.C. operating point of transistor T1 so that the transistor cannot possibly oscillate. The sawtooth and hiss signal inputs are supplied to the circuit connecting the collector of transistor T1 and the base of transistor T2, the sawtooth signal terminal ST being connected via a capacitor C4 and the hiss signal terminal HS being connected via a resistor R7. The parts are proportioned and arranged so that with the gate signal G1 at its normal or down level, the transistor T1 is saturated, and even though input signals are present on lines ST or HS, these signals will not be passed through T2 to the common output because of the clamping action of saturated transistor T1. When gate G1 is raised to its on level (zero volts) the transistor T1 goes out of saturation and the appropriate input signal from line ST or HS is supplied to transistor T2 and thence back through the phase shift network to the base of transistor T1.

With gate line G1 raised to its on level, the circuit is now in condition to oscillate since the added energy of the input has been added into the total loop gain. Thus, a gated oscillator will be excited to produce either a modulated sibilant sound if the hiss input is effective or will produce a damped train of oscillations if the sawtooth input is effective.

The operation under these two conditions is shown by the waveforms illustrated in FIGS. 3 and 4. In FIG. 3, when the gate G1 goes up, and the hiss signal HS is supplied to the quasi oscillator QO-1, the output will be a sine wave signal at the frequency determined by the phase shifting network associated with the quasi oscillator QO-1 modulated by the hiss signal thereon as shown by the waveform for the QO-1 output. In FIG. 4, when the gate signal G1 goes up, the sawtooth signal ST will cause the quasi oscillator to produce a damped train of oscillations, having the fundamental frequency determined by the phase shifting network, but decreasing in amplitude in proportion to the decrease in the sawtooth waveform and then increasing as the sawtooth waveform takes on its higher value.
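The two cases of FIGS. 3 and 4 can be imitated numerically, as a rough sketch rather than a model of the actual circuit. The code below stands in for one quasi oscillator with a discrete-time two-pole resonator whose pole radius is just under one, a loose analogue of a loop that rings but does not sustain itself. The sawtooth drive is approximated by one impulse per pitch period, on the assumption that the fast edge coupled through C4 supplies most of the excitation, and the hiss is approximated by uniform random noise. The function name, sample rate, tuned frequency, pitch period, pole radius and noise level are all hypothetical.

```python
import math
import random


def quasi_oscillator(excitation, gate, freq_hz, sample_rate, pole_radius=0.98):
    """Gated two-pole resonator used as a stand-in for one quasi oscillator.

    While the gate is down the output is forced to zero, imitating the
    clamping action described for saturated T1. This is an assumed model.
    """
    w = 2.0 * math.pi * freq_hz / sample_rate
    a1 = 2.0 * pole_radius * math.cos(w)
    a2 = -pole_radius * pole_radius
    y1 = y2 = 0.0
    out = []
    for x, g in zip(excitation, gate):
        if g:
            y = a1 * y1 + a2 * y2 + x   # ring at the tuned frequency
        else:
            y = 0.0                     # gate down: nothing reaches the output
        out.append(y)
        y2, y1 = y1, y
    return out


SAMPLE_RATE = 8000    # Hz, assumed
TUNED_FREQ = 500.0    # Hz, assumed resonance of this quasi oscillator
N = 400
PITCH_PERIOD = 80     # samples between sawtooth edges, assumed

gate_up = [True] * N
gate_down = [False] * N

# Buzz excitation: one impulse per assumed pitch period.
buzz = [1.0 if n % PITCH_PERIOD == 0 else 0.0 for n in range(N)]

# Hiss excitation: seeded uniform noise so the run is repeatable.
rng = random.Random(0)
hiss = [rng.uniform(-0.1, 0.1) for _ in range(N)]

voiced = quasi_oscillator(buzz, gate_up, TUNED_FREQ, SAMPLE_RATE)
fricative = quasi_oscillator(hiss, gate_up, TUNED_FREQ, SAMPLE_RATE)
silent = quasi_oscillator(buzz, gate_down, TUNED_FREQ, SAMPLE_RATE)
```

In this sketch `voiced` is a train of decaying bursts at the tuned frequency, restarted at each pitch period, while `fricative` is a noisy oscillation centred on the same frequency, and `silent` stays at zero because the gate is down.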

It is thus apparent how one relatively simple oscillator circuit using inexpensive reactive components can be made to function in the same manner as a resonant cavity acting with an external stimulus to produce two entirely different waveforms depending on the excitation but having the same basic oscillating frequency.

The output of the oscillators, as previously pointed out, is supplied to a common output line CM to produce a complex waveform by having the other oscillator outputs added thereto, which complex waveform appears across the load constituting the output gain control OGC. The D.C. compensating network described in FIG. 1 is shown in detail in FIG. 2, and comprises a resistance summing network including the resistors RG1 through RG6, connected respectively to the gate lines G1 through G6, and a common load resistor R8. The common connection of the resistors RG1 through RG6 is connected to the base of a PNP transistor T3, the emitter of which is connected to ground through an adjusted resistor R9, and the collector of which is connected to the common output line for the quasi oscillators, CM. The D.C. compensating network operates to subtract a unit of current for each oscillator which is gated on, since the change in the D.C. operating level to bring an oscillator on results in a change in the D.C. output of the transistor T2 in each of the quasi oscillators. The compensating circuit network maintains the D.C. level of the common oscillator output line CM at a fixed value for any number or combination of oscillators which are gated on. The audio signal developed across the output gain control OGC is applied to a power amplifier OA and thence to a loud speaker LS whereby the complex electric waveform produced by the speech resonating circuits including the quasi oscillators QO-1 through QO-6 is converted to audible sound.
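The bookkeeping performed by the compensating network can be illustrated with an idealized sketch. It assumes that each gated-on oscillator shifts the D.C. current on line CM by the same fixed unit and that the summing network and T3 remove exactly one such unit per raised gate. The function name and the current values are hypothetical, and the real circuit would depend on how R8 and R9 are set.

```python
from itertools import product


def common_line_dc(gates, unit_ma=1.0, quiescent_ma=5.0, compensate=True):
    """Idealized D.C. level on line CM, in assumed milliampere units.

    gates is a sequence of six booleans, True meaning the gate line is up.
    """
    active = sum(1 for g in gates if g)
    oscillator_shift = active * unit_ma                      # added by the T2 stages
    compensation = active * unit_ma if compensate else 0.0   # removed by T3
    return quiescent_ma + oscillator_shift - compensation


# Every one of the 64 gate combinations for G1 through G6.
all_gates = list(product([False, True], repeat=6))
compensated = {common_line_dc(g) for g in all_gates}
uncompensated = {common_line_dc(g, compensate=False) for g in all_gates}
```

Under these assumptions `compensated` contains a single level for all 64 gate combinations, whereas `uncompensated` contains seven distinct levels, one for each possible count of active oscillators.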

From the foregoing, it will be apparent that the generator provided by the subject invention constitutes an economical and reliable device for alternatively generating two varieties of specialized waveforms useful in speech synthesis.

While the invention has been particularly shown and described with reference to a preferred embodiment thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention.

What is claimed is:

1. In combination,

(a) a phase shift oscillator including two cascade connected amplifying devices and a reactive network for providing a feedback connection between the amplifying devices,

(b) biasing means for adjusting the operating point of said oscillator to a point less than that required for oscillation,

(c) gating means for supplying a gating signal to said oscillator,

(d) modulating signal means for supplying a modulating signal to said oscillator,

(e) said gating means and said modulating signal means being effective to cause said oscillator to oscillate at a predetermined base frequency, modulated in accordance with the characteristics of said modulating signal,

(f) and an output circuit connected to said oscillator.

2. The combination recited in claim 1, in which the gating means is connected to the reactive network, and the modulating signal means are connected to the cascade connection of the amplifying devices.

3. In a speech synthesis system, means for simulating the resonant conditions of human oral cavities in modifying speech sounds comprising, in combination,

(a) a plurality of phase-shift oscillators each tuned to oscillate at a corresponding one of a plurality of base frequencies, each of said oscillators including a pair of cascade connected amplifying devices and a feedback circuit including a reactive network,

(b) biasing means for each of said oscillators for adjusting the operating point of the associated oscillator to a point less than that required for oscillation,

(c) a plurality of gate signal means, one associated with each of said oscillators for supplying gate signals to the associated oscillator at predetermined times,

(d) a plurality of modulating signal means, each said modulating means being selectively connected to each of said oscillators at predetermined times,

(e) said gate signal means and said modulating signal means being effective to cause the associated oscillator to oscillate at its particular base frequency, modulated in accordance with the modulating signals supplied to the oscillator concurrently with the gating signal, and

(f) a common output line connected to said oscillators.

4. The combination recited in claim 3, further including a compensating network for maintaining a constant direct current level on the common output line, comprising,

(a) a summing resistor network connected to said gate signal means, and

(b) a current conductive device connected to said summing network and having an output connected to said common output line, the parts being proportioned and arranged so that the output of said current conductive device maintains the D.C. level of said common output line at a constant level, in response to varying numbers of inputs to the summing network.

No references cited. 
